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We present a method for reliably determining the lowest energy structure of an atomic cluster 
in an arbitrary model potential. The method is based on a genetic algorithm, which operates 
on a population of candidate structures to produce new candidates with lower energies. Our 
method dramatically outperforms simulated annealing, which we demonstrate by applying the 
genetic algorithm to a tight-binding model potential for carbon. With this potential, the algorithm 
efficiently finds fullerene cluster structures up to Ceo starting from random atomic coordinates. 



Advances in computer technology have made molecular 
dynamics simulations more and more popular in studying 
the behavior of complex systems. Even with modern-day 
computers, however, there are still two main limitations 
facing atomistic simulations: system size and simulation 
time. While recent developments in parallel computer 
design and algorithms have made considerable progress 
in enlarging the system size that can be accessed using 
atomistic simulations, methods for shortening the simu- 
lation time still remain relatively unexplored. 

One example where such methods will be useful is in 
the determination of the lowest energy configurations of a 
collection of atoms. Because the number of candidate lo- 
cal energy minima grows exponentially with the number 
of atoms, the computational effort scales exponentially 
with problem size, making it a member of the ./VP-hard 
problem class 0]. In practice, realistic potentials de- 
scribing covalently bonded materials possess significantly 
more rugged energy landscapes than the two-body poten- 
tials addressed by the authors of Ref. , further increas- 
ing the difficulty. Attempts to use simulated annealing 
to find the global energy minimum in these systems are 
frustrated by high energy barriers which trap the simu- 
lation in one of the numerous metastable configurations. 
Thus an algorithm is needed which can 'hop' from one 
minimum to another and permit an efficient sampling of 
phase space. 

In this letter, we will describe the application of such 
an algorithm to the concrete example of determining the 
ground state structure of small atomic clusters. The most 
interesting clusters are those which lie in the transition 
range between molecules and bulk matter. These are 
precisely the ones which can be expected to have un- 
usual structures which are unrelated to either the bulk 
or molecular limits. For a few atoms, the ground state 
can sometimes be found by a brute force search of con- 
figuration space. For up to ten or twenty atoms, de- 
pending upon the potential, simulated annealing may be 
employed to generate some candidate ground state con- 



figurations g. For more atoms than this, the simulation 
time required to find the minimum by simulated anneal- 
ing is usually prohibitive, because evaluations of the po- 
tential and forces are too expensive. In this regime one 
is left with judicious guessing of likely candidate ground 
state structures. 

Our approach is based on the genetic algorithm (GA), 
an optimization strategy inspired by the Darwinian evo- 
lution process |3). Starting with a population of candi- 
date structures, we relax these candidates to the nearest 
local minimum. Using the relaxed energies as the crite- 
ria of fitness, a fraction of the population is selected as 
"parents." The next generation of candidate structures 
is produced by "mating" these parents. The process is 
repeated until the ground state structure is located. 

We have applied this algorithm to optimize the geom- 
etry of carbon clusters up to Ceo- In a U cases we studied, 
the algorithm efficiently finds the ground state structures 
starting from an unbiased population of random atomic 
coordinates. This performance is very impressive since 
carbon clusters are bound by strong directional bonds 
which result in large energy barriers between different iso- 
mers. Although there have been many previous attempts 
to generate the Cgo buckyball structure from simulated 
annealing, none has yielded the ground state structure 
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Method - Before presenting our results, we will de- 
scribe our genetic algorithm procedure in more detail. 
The choice of mating procedure is the central choice one 
must make in constructing a genetic algorithm. In an 
efficient algorithm, it should impart important proper- 
ties of the parent clusters to the children. A common 
choice |Q is to first map the physical structure onto a 
binary number string, then use string recombination as a 
mating procedure. Such an approach has been applied to 
optimize the packing structure of small molecular clusters 
and the conformation of some molecules ||] . We found 
that it is not very efficient, however, when used to opti- 
mize the geometry of atomic clusters. This is because the 



1 



mating operation does not preserve the characteristics of 
the parents. 

In the present work, we represent an atomic cluster by 
the list of N atomic cartesian coordinates Xj in arbitrary 
order, 

Q = {xi,x 2 ,...,xjv}. (1) 

Our mating operator P : P(Q, Q') — > Q" performs the 
following action upon two parent geometries Q and Q' 
to produce a child Q" . First, we choose a random plane 
passing through the center of mass of each parent cluster. 
We then cut the parent clusters in this plane, and assem- 
ble the child Q" from the atoms of Q which lie above the 
plane, and the atoms of Q' which lie below the plane. If 
the child generated in this manner does not contain the 
correct number of atoms, the parent clusters are trans- 
lated an equal distance in opposing directions normal to 
the cut plane so as to produce a child Q" which contains 
the correct number of atoms. 

Relaxation to the nearest local minimum is per- 
formed with conjugate-gradient minimization or molecu- 
lar dynamics quenching. Typically, about 16 conjugate- 
gradient steps or about 30 molecular dynamics steps are 
applied to a new geometry before a decision is made 
whether further optimization is warranted. 

We preferentially select parents with lower energy from 
{G}. The probability p(G) of an individual candidate 
Q to be selected for mating is given by the Boltzmann 
distribution 

p(G) cx exp(-£(£)/T m ), (2) 

where E(Q) is the energy per atom of the candidate Q 1 
and the mating 'temperature' T m is chosen to be roughly 
equal to the range of energies in {Q}. 

In some cases, described in the next section, we found 
it necessary to apply mutations to members of the pop- 
ulation. We define a mutation operator M : M(Q) — » Q' 
which performs one of two functions with equal probabil- 
ity. The first mutation function moves the atoms in Q a 
random distance (of the same order as a bond length), in 
a random direction, a random number of times (between 
5 and 50), while separating unphysically close atoms be- 
tween each step. The second mutation function imple- 
ments a simple search for an adjacent watershed in the 
potential energy hypersurface. We employ an algorithm 
which takes a random number of steps in atomic co- 
ordinate space. At each step the algorithm changes di- 
rection so as to maintain travel along a direction slightly 
uphill to an equipotential line. The result of this is gen- 
erally a high-energy cluster, but one which lies in an ad- 
jacent watershed region of E({x}). 

We maintain a population {Q} of p candidates, and 
create subsequent generations as follows. Parents are 
continuously chosen from {0} with probability given by 
Eq. (0) and mated using the mating procedure described 
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FIG. 1. Generation of the Ceo molecule, starting from ran- 
dom coordinates, using the genetic algorithm described by 
the text with 4 candidates (p — 4) and no mutation (fj, = 0). 
The energy per atom is plotted for the lowest energy (solid 
line) and highest energy (dashed line) candidate structure in 
{<?} as a function of the number of genetic mating operations 
P (see text) that have been applied. Several of the inter- 
mediate structures which contain defects are illustrated at 
top: (a) contains one 12-membered ring and two 7-membered 
rings, (b) contains a 7-membered ring, (c) contains the correct 
distribution of pentagons and hexagons, but two pentagons 
are adjacent. The ideal icosahedral buckyball structure is 
achieved shortly after 5000 genetic operations. 

above. A fraction /i of the children generated in this 
way are mutated; fi = means no mutation occurs. The 
(possibly mutated) child is relaxed to the nearest local 
minimum and selected for inclusion in the population if 
its energy is lower than another candidate in {G}- 

This procedure requires the algorithm to keep track of 
a large number of candidates in {{?}, since the popula- 
tion generally becomes filled with almost identical low- 
energy candidates. These duplicated efforts reduce the 
algorithm's efficiency. To prevent this, we introduce an 
energy resolution SE, and allow new entries to {§} only 
if there are no other candidates already in {Q} whose 
energy is within SE of the new entry's energy. 

Results - To illustrate the method, we use a tight- 
binding model for carbon, described elsewhere This 
potential accurately describes the energetics of fullerene 
structures. 

Fig. [l] shows the model potential energy of the lowest 
and highest energy Ceo cluster in {Q} versus the num- 
ber of genetic mating operations performed with no mu- 
tation (/i = 0) on a population of p = 4 candidates, 
starting from coordinates chosen at random. We used 
a mating temperature T m = 0.2 eV/atom, and an en- 
ergy resolution SE — 0.01 eV/atom. This cluster is too 
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large for unbiased simulated annealing [B| to arrive at 
the correct global minimum (the icosahedral buckmin- 
sterfullerene cage). As Fig. |l| illustrates, the genetic al- 
gorithm correctly generates the cage after roughly 5000 
mating operations. 

Fig. |l| illustrates several generic features of the algo- 
rithm. During the initial few generations, the energy 
drops very quickly and the population soon consists of 
reasonable candidates, similar to what would be observed 
with simulated annealing. This initial period is usually 
a small fraction of the total time spent by the algorithm. 
The rest of the time is spent in an end game, where the 
remaining defects in the structure are removed (Fig. |l| 
(a) - (c)). The general behavior of the genetic algorithm 
is remarkably resistant to changes in the details of the 
algorithm. The Cgo cage is found reliably over a wide 
range of values of the mating temperature T m , number of 
candidates p, and the number of conjugate-gradient opti- 
mizations performed upon each application of O. In ad- 
dition, the use of schemes other than Eq. (||) for selecting 
parents from {Q} also leads to the correct final answer. 
For example, we tried using equal mating probabilities 
p(Q) for all candidates regardless of energy, as well as a 
probability linear in the energy. All of these variations 
produced genetic algorithms which worked satisfactorily. 

In cases with several competing low energy states, it 
is sometimes advantageous to investigate the minimiza- 
tion of a number of "ecologies," that is, to repeat the 
above process with different starting populations. For 
example, in smaller clusters of carbon atoms, a bimodal 
mass spectrum has been observed in laser vaporization 
experiments [jLOj, and this has been interpreted jllj as 
evidence that two regimes of Cjv cluster growth exist: for 
N < 25, mono- and polycyclic rings are formed, while for 
N > 25, fullerene cages are formed. Thus, for clusters 
around this size, there is a competition between cage- 
like, ring-like and cap-like structures. Searches for the 
global energy minimum must surmount the difficulty of 
becoming trapped in one of these structural classes. 

Figs. || and |3| show the results of running the genetic 
algorithm on C20 and C30 clusters, using the same pa- 
rameters p = 4 and T m = 0.2 eV/atom that were used 
to generate the Ceo cage. The solid line in Fig. || il- 
lustrates the generic result for C20 when no mutation 
is used (/x = 0). The lowest energy structure for C20 
in the model potential is a polycyclic cap with energy 
—8.671 eV/atom, and the fullerene cage structure is not 
far above, with energy —8.613 eV/atom. Nevertheless, 
only a small fraction of the \i = genetic algorithm ecolo- 
gies find one of these structures within 4000 genetic oper- 
ations. Instead, the ecologies get 'trapped' in monocyclic 
rings with energy —8.503 eV/atom (Fig. || (lc)). The cap 
and the cage structures can be found for C20, however, 
if we include mutations in our algorithm or, equivalently, 
by using molecular dynamics annealing for the relaxation 
process. For example, with fi — 0.05, about 25% of the 
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FIG. 2. Running the genetic algorithm on C20- The solid 
line shows the generic lowest energy structure when the algo- 
rithm is run with no mutation (pi = 0); the structures (la) 
- (lc) are present in the population at the times indicated. 
Essentially all ecologies get trapped in monocyclic rings (lc). 
The dashed line (structures (2a) - (2c)) and the dot-dashed 
line (structures (3a) and (3b)) illustrate the results when mu- 
tation is added (fj, = 0.05). 

ecologies find the polycyclic cap (Fig. ||, broken lines). 

In the case of C30, the lowest energy structure in the 
model potential is a fullerene cage, and roughly 80% of 
the /i = ecologies find it within 4000 genetic opera- 
tions. The remaining 20% form cages, but not quickly 
enough to find the fullerene (Fig. [|, solid line). With 
mutations, convergence to the fullerene cage is greatly 
increased. Essentially all of the ^ = 0.05 ecologies find 
the C30 cage within 4000 genetic operations (Fig. |[ bro- 
ken lines). The role of mutation in the algorithm is to 
allow searches for alternate structural classes. Referring 
to Fig. H one sees precipitous drops in energy when a 
new class of candidate is discovered. In the case of C30, 
the cage structural class appears even with n = but 
is more efficiently reduced to the perfect structure when 
H ± 0. 

We emphasize that mutation by itself does not effi- 
ciently lower the energy of a population. We found that 
application of the mutation operator M in the absence 
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FIG. 3. Running the genetic algorithm on C30. The solid 
line shows the lowest energy structure when the algorithm 
is run with no mutation (fi — 0) for an ecology that failed 
to find the minimum energy configuration (a fullerene cage) 
within 4000 genetic operations. The structures (la) - (lc) 
are present in the population at the times indicated. The 
structure (lc) resulting after 4000 genetic operations is a cage, 
and is eventually reduced to the perfect fullerene cage even 
with n — 0. The broken lines illustrate two /j, — 0.05 ecologies 
which arrive at the perfect cage (2b) via distinct routes (2a), 
(3a). 

of mating leads to a drastic decrease in the efficiency of 
the optimization process. 

Discussion - Like simulated annealing, the genetic al- 
gorithm requires repeated evaluation of the energy and 
forces within the model potential. The higher efficiency 
of the genetic algorithm, however, allows convergence to 
low-energy candidates in larger clusters than is possible 
with simulated annealing. We are currently applying the 
method to larger carbon clusters and will present those 
results elsewhere [Q . In addition, we have applied the al- 
gorithm to systems other than carbon clusters, and our 
preliminary findings indicate that the algorithm is effi- 
cient over a broad class of structural optimization prob- 
lems. For example, we have successfully applied the 
method to bulk and surface geometries, with a suitably 
modified mating operator P. 

The efficiency of the present algorithm may be in- 
creased in special cases when the class of desired struc- 
tures is assumed, and a more complicated mapping be- 
tween the genetic representation (genotype) and the clus- 
ter structure (phenotype) could be employed. For in- 
stance, in the case of the larger carbon fullerene clusters 



we expect that a representation in terms of a face-dual 
model Jl^ l would lead to rapid convergence, since only 
cage structures would be investigated. 

While the artificial dynamics of the genetic algorithm 
cannot be expected to reproduce the natural annealing 
process in which atomic clusters are formed, we found 
that the intermediate structures located by the genetic 
algorithm on its way to the ground state structure are 
very similar to the results of simulated annealing. Thus it 
appears that the same kinetic factors which influence the 
annealing process also affect the ease with which a partic- 
ular candidate is generated by the genetic algorithm. If 
this is true, the genetic algorithm results presented here 
can be viewed as analogous to those of greatly extended 
conventional simulated annealing runs. More work needs 
to be done to determine if this is indeed the case. 
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